Program Description¹

The SuccessMaker® program is a set of computer-based courses
used to supplement regular classroom reading instruction in
grades K-8. Using adaptive lessons tailored to a student’s reading
level, SuccessMaker® aims to improve understanding in areas
such as phonological awareness, phonics, fluency, vocabulary,
comprehension, and concepts of print. “Foundations” courses aim
to help students develop and maintain reading skills. “Exploreware”
courses aim to provide opportunities for exploration, open-ended
instruction, and development of analytical skills. The computer
analyzes students’ skills development and assigns specific segments
of the program, introducing new skills as they become appropriate.
As the student progresses through the program, performance is
measured by the probability of the student answering the next
exercise correctly, which determines the next steps of the lesson.²



Research

Three studies of SuccessMaker® meet What Works Clearinghouse
(WWC) evidence standards³ with reservations. The
three studies included 450 students, ranging in age from nine
to 16 years, who attended elementary, middle, and middle-high
schools in Alabama, Illinois, and Virginia.⁴



Based on these three studies, the WWC considers the extent of 
evidence for SuccessMaker® to be small for alphabetics, reading 
fluency, and general literacy achievement, and medium to large 
for comprehension.⁵



Effectiveness

SuccessMaker® was found to have no discernible effects on alphabetics and reading fluency, and potentially positive effects on
comprehension and general literacy achievement.



| | Alphabetics | Reading fluency | Comprehension | General literacy achievement |
|---|---|---|---|---|
| Rating of effectiveness | No discernible effects | No discernible effects | Potentially positive effects | Potentially positive effects |
| Improvement index⁶ | Average: +1 percentile point; Range: -8 to +5 percentile points | +9 percentile points; Range: na | Average: +11 percentile points; Range: +1 to +15 percentile points | +11 percentile points; Range: na |

na = not applicable

1. The descriptive information for this program was obtained from a publicly available source: the developer’s website (http://www.pearsoned.com, downloaded December
2008). The WWC requests developers to review the program description sections for accuracy from their perspective. Further verification of the accuracy of the descriptive
information for this program is beyond the scope of this review.

2. The most current version of the program is called SuccessMaker®. Earlier versions were called SuccessMaker® Enterprise and Computer Curriculum Corporation (CCC)
SuccessMaker®. We were unable to obtain documentation on the similarities and differences between these versions from the developer.

3. The studies included in this report were reviewed using WWC Evidence Standards, Version 1.0 (see the WWC Standards).

4. The evidence presented in this report is based on available research. Findings and conclusions may change as new research becomes available.

5. A rating of “medium to large” requires at least two studies and two schools across studies in one domain and a total sample size across studies of at least 350 students
or 14 classrooms. Otherwise, the rating is “small.”

6. These numbers show the average and range of student-level improvement indices for all findings across the studies.
Additional program information

Developer and contact

The research underlying SuccessMaker® was initiated by
Patrick Suppes at Stanford University during the 1960s, continued
by Mario Zanotti at the Computer Curriculum Corporation
(Suppes and Zanotti, 1996), and extended and distributed by
Pearson Digital Learning. Address: One Lake Street, Upper
Saddle River, NJ 07458. Email: communications@pearsoned.com.
Web: www.pearsoned.com. Telephone: (201) 236-7000.

Scope of use 

According to the developers, SuccessMaker® has been used in 
more than 17,000 schools across the world. The program has 
been used with at-risk and accelerated learners, general and 
special education students, and English language learners. 

Teaching 

The software is a supplemental program that can be used in
conjunction with existing language arts programs. “Foundations”
courses contain basic skills-building exercises, while “Exploreware”
courses focus on application and literature-based reading
aimed at building higher level analytical skills. Each student
progresses through the computerized lessons at his or her own pace.
The proportion of instruction across concept areas is adjusted
for the individual so that weaker areas receive more emphasis. If
a student continually struggles with a new concept, rather than
staying on the difficult concept, SuccessMaker® sets the material
aside to be reintroduced at a later point. This individualization
allows each student to progress on his or her own time schedule.
SuccessMaker® also periodically checks the student’s recollection
of material previously mastered. Professional development
for using SuccessMaker® is available and focuses on instructional
strategies to incorporate SuccessMaker® into the curricula and
customized on-site support for teachers.

Cost 

Not available online. 



Research

Thirty-six studies reviewed by the WWC investigated the effects
of SuccessMaker®. Three studies (Beattie, 2000; Campbell,
2000; Gallagher, 1996), one randomized controlled trial and two
quasi-experimental designs, meet WWC evidence standards
with reservations. The remaining 33 studies do not
meet WWC evidence standards or eligibility screens.

Beattie (2000) conducted a randomized controlled trial of 
middle and middle-high school students in suburban northern 
Virginia. Students with language deficits, ranging in age from 11 
to 16 years, were randomly assigned by computer-generated 
procedures to one of five groups (Appendix A1.1 provides more 
details about these groups). The WWC based its effectiveness
ratings on findings from comparisons of 14 students who
received SuccessMaker® and 12 control group students who
received regular reading instruction. Although these analytic
samples were shown to be equivalent at baseline, differential
attrition between groups led to the study’s rating of meets
standards with reservations. The study reported student outcomes
after two months of program implementation.

Campbell (2000) conducted a quasi-experiment that examined
the effects of SuccessMaker® on students in upper elementary
grades in Alabama. The schools that used SuccessMaker®
and traditional instruction (Accelerated Reader in conjunction
with a basal reader) were matched to schools that used only
traditional instruction based on the intellectual ability, poverty
level, and demographic characteristics of students in each
school. The WWC based its effectiveness ratings on findings for
grade 4 students: 143 students in four intervention schools and
186 students in four comparison schools. The study reported
student outcomes after one year of program implementation.

Gallagher (1996) conducted a quasi-experiment that examined
the effects of SuccessMaker® on at-risk students in grades
4-7 at an inner city elementary school in Chicago, IL. Students
in each classroom were sorted by either reading achievement
test score or student identification number (ID), and then
alternately assigned to treatment and control groups.⁷ The WWC
based its effectiveness ratings on findings from comparisons
of the 48 students who received two reading components
of SuccessMaker® (Readers Workshop and Reading Adventures)
and the 47 control group students who received math components
of SuccessMaker®. Both groups received their regular
reading curriculum outside of the SuccessMaker® instruction.
The study reported student outcomes after
six weeks of program implementation.
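The concern raised in footnote 7 about alternating assignment can be illustrated with a short sketch. The scores below are made up for illustration and are not data from Gallagher (1996); the function name is hypothetical. The sketch assumes students are sorted by score in ascending order and that assignment always starts with the treatment group.

```python
from statistics import mean

def alternate_assign(sorted_values):
    """Give the 1st, 3rd, 5th, ... value to treatment and the rest to control."""
    return sorted_values[0::2], sorted_values[1::2]

# Hypothetical pretest scores for one classroom (illustrative only).
scores = sorted([3.1, 3.4, 3.4, 3.9, 4.2, 4.8, 5.0, 5.5, 6.1, 6.3])

treatment, control = alternate_assign(scores)

# Within every consecutive pair, the treatment student has the lower
# (or equal) score, so the treatment mean cannot exceed the control mean.
print("treatment mean:", round(mean(treatment), 2))
print("control mean:  ", round(mean(control), 2))
```

If the same list were instead sorted by an ID number unrelated to achievement, the alternating split would carry no such built-in ordering, which is why the footnote treats that case as possibly functionally random.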



Extent of evidence 

The WWC categorizes the extent of evidence in each domain
as small or medium to large (see the WWC Procedures and
Standards Handbook, Appendix G). The extent of evidence
takes into account the number of studies and the total sample
size across the studies that meet WWC evidence standards with
or without reservations.⁸

The WWC considers the extent of evidence for SuccessMaker®
to be small for alphabetics, reading fluency, and general literacy
achievement, and medium to large for comprehension.
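The threshold rule stated in footnote 5 (at least two studies, at least two schools, and at least 350 students or 14 classrooms) can be written as a small check. This is a minimal sketch, not the WWC's own procedure: the class and function names are hypothetical, the school counts are read from the study descriptions in this report (three for Beattie, eight for Campbell, one for Gallagher), and classroom counts are not reported here, so they default to zero.

```python
from dataclasses import dataclass

@dataclass
class Study:
    name: str
    students: int
    schools: int
    classrooms: int = 0  # assumption: not reported in this summary

def extent_of_evidence(studies):
    """Apply the footnote 5 rule to the studies in one outcome domain."""
    enough_studies = len(studies) >= 2
    enough_schools = sum(s.schools for s in studies) >= 2
    enough_sample = (sum(s.students for s in studies) >= 350
                     or sum(s.classrooms for s in studies) >= 14)
    if enough_studies and enough_schools and enough_sample:
        return "medium to large"
    return "small"

beattie = Study("Beattie (2000)", students=26, schools=3)
campbell = Study("Campbell (2000)", students=329, schools=8)
gallagher = Study("Gallagher (1996)", students=95, schools=1)

# Alphabetics, reading fluency, and general literacy rest on Beattie alone.
print("alphabetics:", extent_of_evidence([beattie]))
# Comprehension draws on all three studies (450 students in total).
print("comprehension:", extent_of_evidence([beattie, campbell, gallagher]))
```

Under these assumptions the sketch agrees with the categorization reported above: small for the single-study domains and medium to large for comprehension.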



Effectiveness Findings 

The WWC review of interventions for SuccessMaker® addresses 
student outcomes in four domains: alphabetics, reading fluency, 
comprehension, and general literacy achievement. The studies 
included in this report cover all four domains. The findings below 
present the authors’ estimates and WWC-calculated estimates 
of the size and the statistical significance of the effects of 
SuccessMaker® on students.⁹

Alphabetics. Beattie (2000) did not find statistically significant
effects of SuccessMaker® on alphabetics measures, including
the Woodcock-Johnson subtests of Letter-Word Identification,
Word Attack, and Auditory Processing, and the Wide Range
Achievement Spelling subtest. The WWC-calculated average
effect size across the four outcomes was not large enough to be
considered substantively important according to WWC criteria
(that is, an effect size at least 0.25).¹⁰

Reading fluency. Beattie (2000) did not find a statistically 
significant effect of SuccessMaker® on the Gray Oral Reading 
Test, and the effect was not large enough to be considered 
substantively important according to WWC criteria. 

Comprehension. Beattie (2000) did not find statistically 
significant effects of SuccessMaker® on the Woodcock-Johnson 



7. The authors either sorted the students by student identification numbers (ID) or Iowa Test of Basic Skills (ITBS) reading comprehension scores, and
then assigned students to groups in an alternating fashion, but it is not clear which method was used from the text. If they sorted by student ID and
then assigned students to groups, the assignment might be functionally random, but if they sorted by ITBS score, and always assigned students in
an alternating fashion (starting with the treatment group, for example), the groups would be imbalanced, because they were always assigning the lower
(or higher) scores to the treatment group. The WWC could not confirm that the assignment was truly random, as the authors had not responded to the
WWC query at the time of publication of this review.

8. The extent of evidence categorization was developed to tell readers how much evidence was used to determine the intervention rating, focusing on
the number and size of studies. Additional factors associated with a related concept, external validity, such as the students’ demographics and the
types of settings in which studies took place, are not taken into account for the categorization. Information about how the extent of evidence rating
was determined for SuccessMaker® is in Appendix A6.

9. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms
or schools and for multiple comparisons. For an explanation, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical
significance, see WWC Procedures and Standards Handbook, Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D
for multiple comparisons. In the case of Beattie (2000), a correction for multiple comparisons was needed, so the significance levels may differ from those
reported in the original study. In the case of Campbell (2000), corrections for clustering and multiple comparisons were needed, so the significance levels
may differ from those reported in the original study. In the case of Gallagher (1996), no corrections for clustering or multiple comparisons were needed.

10. The WWC computes an average effect size (ES) as a simple average of the ESs across all individual findings within the study domain. For information
on how the WWC characterizes study effects, consult the WWC Procedures and Standards Handbook, Appendix E.



Passage Comprehension subtest, but the effect size was large
enough to be considered substantively important according
to WWC criteria (that is, an effect size at least 0.25). Campbell
(2000) did not find statistically significant effects of SuccessMaker®
on either measure of comprehension examined (the
Stanford Achievement Reading Vocabulary and Reading
Comprehension subtests). The WWC-calculated average
effect size across the two outcomes was not large enough
to be considered substantively important according to WWC
criteria. Gallagher (1996) found a statistically significant effect
of SuccessMaker® on the reading comprehension subtest of
the Iowa Test of Basic Skills. The WWC found that the effect
was not statistically significant but large enough to be considered
substantively important according to WWC criteria.¹¹



General literacy achievement. Beattie (2000) did not find statistically
significant effects of SuccessMaker® on the Clinical Evaluation
of Language Fundamentals Receptive Language Score, but the
effect size was large enough to be considered substantively important
according to WWC criteria (that is, an effect size at least 0.25).

Rating of effectiveness 

The WWC rates the effects of an intervention in a given outcome
domain as positive, potentially positive, mixed, no discernible
effects, potentially negative, or negative. The rating of effectiveness
takes into account four factors: the quality of the research design,
the statistical significance of the findings, the size of the difference
between participants in the intervention and the comparison conditions,
and the consistency in findings across studies (see the WWC
Procedures and Standards Handbook, Appendix E).



Improvement index 

The WWC computes an improvement index for each individual
finding. In addition, within each outcome domain, the WWC
computes an average improvement index for each study and an
average improvement index across studies (see WWC Procedures
and Standards Handbook, Appendix F). The improvement index
represents the difference between the percentile rank of the average
student in the intervention condition versus the percentile rank
of the average student in the comparison condition. Unlike the
rating of effectiveness, the improvement index is based entirely
on the size of the effect, regardless of the statistical significance
of the effect, the study design, or the analyses. The improvement
index can take on values between -50 and +50, with positive
numbers denoting results favorable to the intervention group.¹²
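A minimal sketch of how an effect size maps onto the improvement index follows. It assumes the usual conversion, in which the effect size is passed through the standard normal cumulative distribution to obtain the intervention group's expected percentile rank and 50 (the comparison group's percentile) is subtracted. The function name is hypothetical, and the handbook's Appendix F remains the authoritative definition.

```python
import math

def improvement_index(effect_size):
    """Percentile-point difference implied by a standardized effect size."""
    # Standard normal CDF written with the error function.
    percentile = 0.5 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))
    return 100.0 * percentile - 50.0

# Effect sizes reported for the four alphabetics findings in Appendix A3.1.
for es in (-0.20, 0.07, 0.12, 0.10):
    print(f"effect size {es:+.2f} -> improvement index {improvement_index(es):+.0f}")
```

Rounded to whole percentile points, these four values come out as -8, +3, +5, and +4, matching the improvement indices listed in Appendix A3.1. Because the normal distribution function is bounded by 0 and 1, the result always stays between -50 and +50.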

The average improvement index for alphabetics is +1 percentile
point (based on findings from one study), with a range of -8
to +5 percentile points across findings. The improvement index
for reading fluency is +9 percentile points for a single finding
from one study. The average improvement index for comprehension
is +11 percentile points across three studies, with a range
of +1 to +15 percentile points across findings. The improvement
index for general literacy achievement is +11 percentile points for
a single finding from one study.

Summary 

The WWC reviewed 36 studies on SuccessMaker®. Three of
these studies meet WWC evidence standards with reservations.
The remaining 33 studies do not meet WWC evidence
standards or eligibility screens. Based on the three studies, the
WWC found no discernible effects in alphabetics and reading
fluency, and potentially positive effects in comprehension and
general literacy achievement. The conclusions presented in this
report may change as new research emerges.



11. The study is not consistent in reporting the numbers of students allocated to treatment and control groups. The WWC calculated the groups’ sample
sizes, means, and standard deviations from the raw data presented in the study appendices.

12. For information on how to interpret the improvement index, consult WWC Procedures and Standards Handbook, Appendix F. 
Appendix A1.1 Study characteristics: Beattie, 2000 (randomized controlled trial with severe attrition)



Characteristic 


Description 


Study citation 


Beattie, K. K. (2000). The effects of intensive computer-based language intervention on language functioning and reading achievement in language-impaired adolescents 
(Doctoral dissertation, George Mason University, 2000). Dissertation Abstracts International, 6?(08A), 194-3116. 


Participants 


Eighty-one students with language deficits, ranging in age from 11 to 16 years, were randomly assigned by computer-generated procedures to one of four intervention groups¹
or to a control group in a two-step process. The researchers first assigned 18 students to the two intervention groups (that received a phase of SuccessMaker® and Fast
ForWord® and also concomitantly participated in a functional resonance imaging research project). Then, the remaining participants were randomly assigned across the five
groups. To ensure an equal distribution among groups, fewer students were placed in the first two groups at the second step of randomization. For this review, the WWC
reported results from 14 students in the SuccessMaker® group who were compared to 12 students in the comparison group.² Although the differential attrition rate was
higher than 7%, the post-attrition intervention and comparison groups were equivalent on the pretest achievement measures.


Setting 


Two middle schools and one middle-high school located in the suburbs of a large metropolitan area in northern Virginia. 


Intervention 


Students worked on SuccessMaker® for 90-94 minutes a day, five days a week. The intervention ended after each student completed 64-80 hours on the program. 
The study reported student outcomes after two months of program implementation. 


Comparison 


The control group received the standard instruction provided in the regular school curriculum. 


Primary outcomes 
and measurement 


For both pre- and posttests, the author administered the Gray Oral Reading Test, four subtests of the Woodcock-Johnson Psycho-Educational Battery (Letter-Word Identification,
Word Attack, Passage Comprehension, and Auditory Processing), the Spelling subtest of the Wide Range Achievement Test, and the Receptive Language subtest of the
Clinical Evaluation of Language Fundamentals. For a more detailed description of these outcome measures, see Appendices A2.1-A2.4.


Staff/teacher training 


No information on training for the teachers and staff in this study was provided. To facilitate the use of SuccessMaker®, computers were procured or updated to meet criteria 
for running SuccessMaker® software. 



1. The first intervention group received two phases of Fast ForWord®; the second intervention group received two phases of SuccessMaker®; the third and fourth intervention groups received
a phase of Fast ForWord® and a phase of SuccessMaker®.

2. The analysis samples for SuccessMaker® and Fast ForWord® groups were not shown to be equivalent at baseline. Two other groups, which combined SuccessMaker® and Fast ForWord®, 
are not appropriate counterfactuals, because the measures of effects cannot be attributed solely to the SuccessMaker® program. 
Appendix A1.2 Study characteristics: Campbell, 2000 (quasi-experimental design)



Characteristic 


Description 


Study citation 


Campbell, J. P. (2000). A comparison of computerized and traditional instruction in the area of elementary reading (Doctoral dissertation, University of Alabama, 2000).
Dissertation Abstracts International, 61(03A), 77-952.


Participants 


Based on the School Ability Index score, five elementary schools that used both SuccessMaker® and traditional instruction were matched to five elementary schools that
used only traditional instruction.¹ Poverty level and gender were similar across intervention and comparison schools. Although the overall and differential student attrition
rates were high (58% and 37%, respectively), the post-attrition intervention and comparison samples of fourth-graders were equivalent on both subtests of the Stanford
Achievement Test at baseline.² After one year, 143 students in four SuccessMaker® schools and 186 students in four comparison schools remained in the sample.


Setting 


The analysis sample included eight elementary schools in Etowah County, Alabama. 


Intervention 


Students in the intervention group received 10 to 20 minutes of SuccessMaker® instruction daily. They were also given traditional instruction that included the Accelerated
Reader program in conjunction with a basal reader. The study was conducted during the first year of SuccessMaker® program implementation.


Comparison 


Comparison classrooms implemented the standard district curriculum, which used the Accelerated Reader program in conjunction with a basal reader.


Primary outcomes 
and measurement 


For both pre- and posttests, the author used two subtests of the Stanford Achievement Test administered by schools: Vocabulary and Reading Comprehension.
The Otis-Lennon School Ability Test was also used in the study, but was not included in this report because it was outside the scope of the Adolescent Literacy review.
For a more detailed description of the outcome measures included in this report, see Appendix A2.3.


Staff/teacher training 


In order to maintain consistency in the administration of the outcome measure (SAT-9), all test administrators and proctors were trained in the areas of test security 
and proper administration techniques. 



1. For the overall grade 5 analysis sample, the intervention and comparison groups were not shown to be equivalent at baseline and are, therefore, excluded from review. As a result, two schools
were dropped from the analysis.

2. WWC aggregated reading achievement data across schools to conduct the analyses. 
Appendix A1.3 Study characteristics: Gallagher, 1996 (quasi-experimental design)



Characteristic 


Description 


Study citation 


Gallagher, E. M. (1996). Utilization of an ILS to increase reading comprehension (integrated learning systems, CAI) (Doctoral dissertation, Northern Illinois University, 1996).
Dissertation Abstracts International, 5S(05A), 79-1 59.


Participants 


Students in grades 4-7 were pretested using the Iowa Test of Basic Skills (ITBS), sorted by either the ITBS score or student identification number (ID), and then alternately
assigned to treatment or control groups within classrooms.¹ All of the students were African-American and were eligible for the federal free lunch program. Students who
scored below 3.0 on the reading comprehension subtest or who were part of the school’s special education program were eliminated from the study’s sample prior to the
assignment. Although the overall attrition rate at posttest was 38%, the post-attrition intervention and comparison groups were equivalent on the reading achievement pretest
measure (ITBS). In all, 48 students in the SuccessMaker® group and 47 students in the comparison group were included in the analysis sample. Additional findings reflecting
student outcomes by grade can be found in Appendix A4.


Setting 


The study took place in an inner city elementary school in Chicago, Illinois. 


Intervention 


The intervention group spent a minimum of 40 minutes a day on the two reading components of the SuccessMaker® program. The Readers Workshop component is an 
individualized basic skill building program. In the first 100 minutes a student participates in the program, the computer analyzes their skills development and assigns specific 
segments of the program appropriate to further develop the students’ skills, introducing new skills as they become appropriate. The Reading Adventures component places 
each student at a reading level and provides stories and comprehension questions at that level. The student progresses through a semi-linear program where the only choice 
is among stories at the assigned level. Outside of the SuccessMaker® instruction, the intervention group also received the regular reading curriculum. The study reported 
students’ outcomes after six weeks of program implementation. 


Comparison 


The comparison group spent a minimum of 40 minutes a day on the math components of the SuccessMaker® program (Math Concepts and Skills and Problem Solving). 
Comparison students also received the regular reading curriculum. 


Primary outcomes 
and measurement 


For both pre- and posttests, the author used the reading comprehension subtest of the Iowa Test of Basic Skills. For a more detailed description of this outcome measure,
see Appendix A2.3.


Staff/teacher training 


No information on training for the teachers and staff in this study was provided. 



1. The authors either sorted the students by student identification numbers (ID) or Iowa Test of Basic Skills (ITBS) reading comprehension scores, and then assigned students to groups in an
alternating fashion, but it is not clear which method was used from the text. If they sorted by student ID and then assigned students to groups, the assignment might be functionally random, but
if they sorted by ITBS score, and always assigned students in an alternating fashion (starting with the treatment group, for example), the groups would be imbalanced, because they were always
assigning the lower (or higher) scores to the treatment group. The WWC could not confirm that the assignment was truly random, as the authors had not responded to the WWC query at the time
of publication of this review.
Appendix A2.1 Outcome measures for the alphabetics domain



Outcome measure 


Description 


Phonemic awareness 

Woodcock-Johnson 
Psycho-Educational 
Battery-Revised (WJ-R), 
Tests of Cognitive Abilities; 
Auditory Processing subtest 


This composite is a standardized measure of a student’s ability to appreciate patterns among speech-based auditory stimuli. The score is derived from scores on three
subtests: (1) the Sound Blending subtest measures the ability to synthesize sequences of sounds into whole words; (2) the Incomplete Words subtest measures the ability to
identify a word with missing sounds; and (3) the Sound Patterns subtest measures ability to indicate whether pairs of computer-generated sound sequences are the same or
different (as cited in Beattie, 2000).


Phonics 

WJ-R, Tests of Achievement; 
Word Attack subtest 


This standardized subtest measures phonemic decoding skills by asking students to read pseudowords (e.g., plurp, fronkett). Students are aware that the words are not real
(as cited in Beattie, 2000 and http://www.concordspedpac.org/WJ-III-subtests.htm#Achievement).


WJ-R, Tests of 
Achievement; Letter-Word 
Identification subtest 


This standardized subtest requires the student to read aloud isolated letters and real words that range in frequency and difficulty (as cited in Beattie, 2000). 


Wide Range Achievement 
Test-Third Edition (WRAT-3); 
Spelling subtest 


This standardized subtest is a paper-and-pencil task that tests students’ ability to write their names, as well as letters and words from dictation. Dictated letters and words 
followed either phonetically regular or irregular patterns (as cited in Beattie, 2000). 



Appendix A2.2 Outcome measures for the reading fluency domain 



Outcome measure 


Description 


Gray Oral Reading Test-Third 
edition (GORT-3) 


In this standardized test, students are required to read orally a variety of graded passages to measure reading rate, word identification, and comprehension skills.
The Passage subtest assesses a combination of rate and accuracy. The Comprehension subtest requires a student to respond to five multiple choice questions following
each story. The Oral Reading Quotient is reflective of a total measure of one’s oral reading performance and is calculated by combining the Passage and Comprehension
scores (as cited in Beattie, 2000).
Appendix A2.3 Outcome measures for the comprehension domain



Outcome measure 


Description 


Vocabulary development 

The Stanford Achievement 
Test (SAT-9); Reading 
Vocabulary subtest 


This standardized subtest is composed of multiple-choice and open-ended assessment questions that measure word reading and achievement. The open-ended reading section
includes a narrative reading selection followed by nine questions. There are three types of reading selections: (1) recreational (material read for enjoyment or literary merit,
including folk tales, historical fiction, contemporary fiction, humor, and poetry), (2) textual (expository material with content from the natural, physical, and social sciences, as
well as other nonfiction general information materials) and (3) functional (material encountered in everyday life both inside and outside of school, including directions, forms,
labels, schedules, and advertisements) (as cited in Campbell, 2000 and http://brighted.funeducation.com/Prepare/StateTests/?state=SAT-9).


Reading comprehension 

SAT-9; Reading 
Comprehension subtest 


This standardized subtest is based on questions that range from interpreting simple sentences to understanding more complex paragraphs. The questions on complex 
paragraphs ask the student to recognize directly stated details or relationships, as well as implicit information and relationships that demand integration of what is provided 
in the text (as cited in Campbell, 2000). 


WJ-R, Tests of 
Achievement; Passage 
Comprehension subtest 


In this standardized test, comprehension is measured by having students fill in missing words in a short paragraph (e.g., “Woof,” said the _____, biting the hand that
fed it.) (as cited in Beattie, 2000 and http://www.concordspedpac.org/WJ-III-subtests.htm#Achievement).


The Iowa Test of 
Basic Skills; Reading 
Comprehension subtest 


This standardized test consists of reading passages of varying length and difficulty and assesses three types of understanding: (1) factual questions tap students’ literal 
understanding of what is stated in the text; (2) inferential/interpretive questions require students to “read between the lines” to demonstrate their understanding of what is 
implied; and (3) analysis and generalization questions require students to “step back from” the text to generalize about a passage’s main points or ideas or to analyze aspects
of the author’s viewpoint or use of language (as cited in http://www.riverpub.com/products/itbs/details.html). 



Appendix A2.4 Outcome measures for the general literacy achievement domain



Outcome measure 


Description 


Clinical Evaluation of 
Language Fundamentals- 
Third Edition (CELF-3); 
Receptive Language Score 


This standardized assessment measures a student’s ability to interpret and execute commands of increasing complexity and understand relationships between words 
and categories. It addresses sentence structure, concepts and directions, and word classes (as cited in Beattie, 2000). 

Appendix A3.1 Summary of study findings included in the rating for the alphabetics domain¹



Authors’ findings from the study: mean outcome² (standard deviation)³ for each group. WWC calculations: mean difference⁴, effect size⁵, statistical significance⁶, and improvement index⁷.

Outcome measure | Study sample | Sample size (students) | SuccessMaker® group | Comparison group | Mean difference⁴ (SuccessMaker® - comparison) | Effect size⁵ | Statistical significance⁶ (at α = 0.05) | Improvement index⁷

Beattie, 2000 (randomized controlled trial with attrition)⁸
WJ-R Letter-Word Identification subtest | 11-16 yrs old | 26 | 89.69 (9.48) | 92.08 (13.15) | -2.39 | -0.20 | ns | -8
WJ-R Word Attack subtest | 11-16 yrs old | 26 | 86.99 (17.65) | 85.91 (12.87) | 1.08 | 0.07 | ns | +3
WJ-R Auditory Processing subtest | 11-16 yrs old | 26 | 87.44 (13.38) | 85.66 (15.61) | 1.78 | 0.12 | ns | +5
WRAT-3 Spelling subtest | 11-16 yrs old | 26 | 87.02 (12.66) | 85.66 (13.13) | 1.36 | 0.10 | ns | +4
Average for alphabetics (Beattie, 2000)⁹ | | | | | | 0.02 | ns | +1


ns = not statistically significant 

WJ-R = Woodcock-Johnson Revised 

WRAT-3 = Wide Range Achievement Test-Third Edition 



1. This appendix reports findings considered for the effectiveness rating and the average improvement indices for the alphabetics domain. 

2. The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups. 

3. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.

4. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. 

5. For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B. 

6. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

7. The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group.

8. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple
comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see WWC Procedures and
Standards Handbook, Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. In the case of Beattie (2000), a correction for multiple
comparisons was needed, so the significance levels may differ from those reported in the original study.

9. This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. 

The domain improvement index is calculated from the average effect size. 
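
Footnote 5 defers the effect size calculation to the WWC Procedures and Standards Handbook. As a rough sketch only, the code below assumes the effect size is a standardized mean difference (Hedges' g) with a pooled standard deviation and a small-sample correction. It also assumes the 26 students split evenly into two groups of 13, which this appendix does not state. The function name `hedges_g` is hypothetical.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with a small-sample correction (assumed formula)."""
    # Pool the two group variances, weighting each by its degrees of freedom.
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Small-sample correction factor applied to the raw standardized difference.
    omega = 1 - 3 / (4 * (n_t + n_c) - 9)
    return omega * d


# WJ-R Letter-Word Identification row; the 13/13 group split is an assumption.
g = hedges_g(89.69, 9.48, 13, 92.08, 13.15, 13)
print(round(g, 2))
```

Under these assumptions the result rounds to the tabled effect size for that row; a different group split would shift the value slightly.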

Appendix A3.2 Summary of study findings included in the rating for the reading fluency domain¹









Authors’ findings from the study: mean outcome² (standard deviation)³ for each group. WWC calculations: mean difference⁴, effect size⁵, statistical significance⁶, and improvement index⁷.

Outcome measure | Study sample | Sample size (students) | SuccessMaker® group | Comparison group | Mean difference⁴ (SuccessMaker® - comparison) | Effect size⁵ | Statistical significance⁶ (at α = 0.05) | Improvement index⁷

Beattie, 2000 (randomized controlled trial with attrition)⁸
Gray Oral Reading test (GORT-3) | 11-16 yrs old | 26 | 83.18 (12.72) | 79.50 (17.76) | 3.68 | 0.23 | ns | +9
Average for reading fluency (Beattie, 2000)⁹ | | | | | | 0.23 | ns | +9




ns = not statistically significant 



1. This appendix reports findings considered for the effectiveness rating and the average improvement indices for the reading fluency domain. 

2. The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups. 

3. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants 
had more similar outcomes. 

4. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. 

5. For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B. 

6. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

7. The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. 
The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group. 

8. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple
comparisons. For an explanation, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook,
Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. In the case of Beattie (2000), no corrections for clustering and multiple
comparisons were needed.

9. This row provides the study average, which in this instance is also the domain average. The domain improvement index is calculated from the average effect size. 
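
Footnote 7 describes the improvement index as a difference in percentile ranks that is bounded by -50 and +50. A minimal sketch, assuming outcomes are normally distributed so that the index is the standard normal CDF of the effect size, expressed in percentile points, minus 50. The function name `improvement_index` is hypothetical and the normality assumption is not spelled out in this appendix.

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Percentile-point gain of the average intervention student (assumed normal outcomes)."""
    # The comparison-group average sits at the 50th percentile by construction.
    return NormalDist().cdf(effect_size) * 100 - 50


# Effect sizes taken from the alphabetics and reading fluency tables.
for g in (-0.20, 0.23):
    print(g, round(improvement_index(g)))
```

With the reading fluency effect size of 0.23, this sketch lands on the tabled index of +9.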

Appendix A3.3 Summary of study findings included in the rating for the comprehension domain¹



Authors’ findings from the study²: mean outcome³ (standard deviation)⁴ for each group. WWC calculations: mean difference⁵, effect size⁶, statistical significance⁷, and improvement index⁸.

Outcome measure | Study sample | Sample size (clusters/students) | SuccessMaker® group | Comparison group | Mean difference⁵ (SuccessMaker® - comparison) | Effect size⁶ | Statistical significance⁷ (at α = 0.05) | Improvement index⁸

Beattie, 2000 (randomized controlled trial with attrition)⁹
WJ-R Passage Comprehension subtest | 11-16 yrs old | 26 | 97.03 (8.08) | 93.25 (11.30) | 3.78 | 0.38 | ns | +15
Average for comprehension (Beattie, 2000)¹⁰ | | | | | | 0.38 | ns | +15

Campbell, 2000 (quasi-experimental design)⁹
SAT-9 Reading Vocabulary subtest | Grade 4 | 8/329 | 60.54 (23.36) | 60.01 (24.12) | 0.53 | 0.02 | ns | +1
SAT-9 Reading Comprehension subtest | Grade 4 | 8/329 | 60.29 (23.14) | 58.08 (24.76) | 2.21 | 0.09 | ns | +4
Average for comprehension (Campbell, 2000)¹⁰ | | | | | | 0.06 | ns | +2

Gallagher, 1996 (quasi-experimental design)⁹
ITBS Reading Comprehension subtest | Grades 4-7 | 95 | 30.25 (10.78) | 26.72 (8.32) | 3.53 | 0.36 | ns | +14
Average for comprehension (Gallagher, 1996)¹⁰ | | | | | | 0.36 | ns | +14

Domain average for comprehension across all studies¹⁰ | | | | | | 0.27 | na | +11



ns = not statistically significant 


na = not applicable 
















WJ-R = Woodcock-Johnson-Revised 


SAT-9 = Stanford Achievement Test 


ITBS = Iowa Test of Basic Skills 











1. This appendix reports findings considered for the effectiveness rating and the average improvement indices for the comprehension domain. 

2. For Gallagher (1996), the WWC calculated groups’ sample sizes, means, and standard deviations from the raw data presented in the study appendices. 

3. The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups. 

4. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants 
had more similar outcomes. For Campbell (2000), the WWC aggregated means and standard deviations across four schools. 

5. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. 

6. For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B. 

7. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

8. The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. 
The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group. 

9. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple
comparisons. For an explanation, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook,
Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. In the case of Campbell (2000), corrections for clustering and multiple
comparisons were needed, so the significance levels may differ from those reported in the original study. In the cases of Beattie (2000) and Gallagher (1996), no corrections for clustering or multiple
comparisons were needed.

10. The WWC-computed domain average effect sizes for each study and for the domain across studies are simple averages rounded to two decimal places. The average improvement indices are 
calculated from the average effect size. 
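
Footnote 10 states that the domain average is a simple average rounded to two decimal places and that the improvement index follows from it. The sketch below reproduces that arithmetic from the three tabled study averages. The conversion to an improvement index assumes normally distributed outcomes, and the variable names are illustrative.

```python
from statistics import NormalDist, mean

# Study-level average effect sizes as reported in the comprehension table.
study_averages = {
    "Beattie, 2000": 0.38,
    "Campbell, 2000": 0.06,
    "Gallagher, 1996": 0.36,
}

# Simple (unweighted) average across studies, rounded to two decimals.
domain_average = round(mean(study_averages.values()), 2)

# Improvement index from the average effect size (assumes normal outcomes).
domain_index = round(NormalDist().cdf(domain_average) * 100 - 50)

print(domain_average, domain_index)
```

Because the average is unweighted, the 26-student Beattie study counts as much as the 329-student Campbell study in the domain figure.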

Appendix A3.4 Summary of study findings included in the rating for the general literacy achievement domain¹









Authors’ findings from the study: mean outcome² (standard deviation)³ for each group. WWC calculations: mean difference⁴, effect size⁵, statistical significance⁶, and improvement index⁷.

Outcome measure | Study sample | Sample size (students) | SuccessMaker® group | Comparison group | Mean difference⁴ (SuccessMaker® - comparison) | Effect size⁵ | Statistical significance⁶ (at α = 0.05) | Improvement index⁷

Beattie, 2000 (randomized controlled trial with attrition)⁸
Receptive Language subtest (CELF-3) | 11-16 yrs old | 26 | 92.81 (18.35) | 86.63 (22.74) | 5.98 | 0.28 | ns | +11
Average for general literacy achievement (Beattie, 2000)⁹ | | | | | | 0.28 | ns | +11




ns = not statistically significant 

CELF-3 = Clinical Evaluation of Language Fundamentals-Third Edition 



1. This appendix reports findings considered for the effectiveness rating and the average improvement indices for the general literacy achievement domain. 

2. The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups. 

3. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.

4. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. 

5. For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B. 

6. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups. 

7. The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group.

8. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple
comparisons. For an explanation, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate the statistical significance, see WWC Procedures and Standards Handbook,
Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for multiple comparisons. In the case of Beattie (2000), no corrections for clustering and multiple
comparisons were needed.

9. This row provides the study average, which in this instance is also the domain average. The domain improvement index is calculated from the average effect size. 

Appendix A4 Summary of subgroup findings for the comprehension domain¹









Authors’ findings from the study²: mean outcome³ (standard deviation)⁴ for each group. WWC calculations: mean difference⁵, effect size⁶, statistical significance⁷, and improvement index⁸.

Outcome measure | Study sample | Sample size (students) | SuccessMaker® group | Comparison group | Mean difference⁵ (SuccessMaker® - comparison) | Effect size⁶ | Statistical significance⁷ (at α = 0.05) | Improvement index⁸

Gallagher, 1996 (quasi-experimental design)⁹
ITBS Reading comprehension subtest | Grade 4 | 32 | 31.69 (8.01) | 29.19 (5.53) | 2.50 | 0.35 | ns | +14
ITBS Reading comprehension subtest | Grade 5 | 32 | 27.31 (10.74) | 25.00 (9.64) | 2.31 | 0.22 | ns | +9
ITBS Reading comprehension subtest | Grade 6 | 20 | 34.60 (12.43) | 24.80 (6.12) | 9.80 | 0.96 | Statistically significant | +33
ITBS Reading comprehension subtest | Grade 7 | 11 | 27.47 (14.22) | 28.20 (14.13) | -0.73 | -0.05 | ns | -2











ns = not statistically significant
ITBS = Iowa Test of Basic Skills



1. This appendix presents subgroup findings for measures that fall in the comprehension domain. Total group scores were used for rating purposes and are presented in Appendix A3.3.

2. For Gallagher (1996), the WWC calculated groups’ sample sizes, means, and standard deviations from the raw data presented in the study appendices.

3. The intervention group values are the comparison group means plus the difference in mean gains between the intervention and comparison groups.

4. The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.

5. Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.

6. For an explanation of the effect size calculation, see WWC Procedures and Standards Handbook, Appendix B.

7. Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.

8. The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition.
The improvement index can take on values between -50 and +50, with positive numbers denoting results favorable to the intervention group.

9. The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple
comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas
the WWC used to calculate statistical significance, see WWC Procedures and Standards Handbook, Appendix C for clustering and WWC Procedures and Standards Handbook, Appendix D for
multiple comparisons. In the case of Gallagher (1996), no correction for clustering was needed.

Appendix A5.1 SuccessMaker® rating for the alphabetics domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹
For the outcome domain of alphabetics, the WWC rated SuccessMaker® as having no discernible effects. It did not meet the criteria for positive, potentially positive,
mixed, potentially negative, or negative effects because no studies showed statistically significant or substantively important effects, either positive or negative.



Rating received 

No discernible effects: No affirmative evidence of effects. 

• Criterion 1: No studies showing a statistically significant or substantively important effect, either positive or negative.

Met. No studies showed statistically significant or substantively important effects, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

Not met. No studies showed statistically significant positive effects.

AND

• Criterion 2: No studies showing statistically significant or substantively important negative effects.

Met. No studies showed statistically significant or substantively important negative effects.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Not met. No studies showed a statistically significant or substantively important positive effect.

AND

• Criterion 2: No studies showing a statistically significant or substantively important negative effect, and fewer or the same number of studies showing
indeterminate effects than showing statistically significant or substantively important positive effects.

Not met. No studies showed a statistically significant or substantively important negative effect, and one study showed indeterminate effects.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant
or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

Not met. No studies showed a statistically significant or substantively important effect, either positive or negative.

OR 

• Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing 
a statistically significant or substantively important effect. 

Not met. No studies showed a statistically significant or substantively important effect, and one study showed indeterminate effects. 






Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important negative effect. 

Not met. No studies showed a statistically significant or substantively important negative effect. 

AND 

• Criterion 2: No studies showing a statistically significant or substantively important positive effect, or more studies showing statistically significant or substantively
important negative effects than showing statistically significant or substantively important positive effects. 

Met. No studies showed statistically significant or substantively important positive effects. 

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design. 

Not met. No studies showed statistically significant or substantively important negative effects. 

AND 

• Criterion 2: No studies showing statistically significant or substantively important positive effects. 

Met. No studies showed statistically significant or substantively important positive effects. 

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of 
potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E. 

Appendix A5.2 SuccessMaker® rating for the reading fluency domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹
For the outcome domain of reading fluency, the WWC rated SuccessMaker® as having no discernible effects. It did not meet the criteria for positive, potentially positive,
mixed, potentially negative, or negative effects because no studies showed statistically significant or substantively important effects, either positive or negative.



Rating received 

No discernible effects: No affirmative evidence of effects. 

• Criterion 1: No studies showing a statistically significant or substantively important effect, either positive or negative.

Met. No studies showed statistically significant or substantively important effects, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

Not met. No studies showed statistically significant positive effects.

AND

• Criterion 2: No studies showing statistically significant or substantively important negative effects.

Met. No studies showed statistically significant or substantively important negative effects.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Not met. No studies showed a statistically significant or substantively important positive effect.

AND

• Criterion 2: No studies showing a statistically significant or substantively important negative effect, and fewer or the same number of studies showing
indeterminate effects than showing statistically significant or substantively important positive effects.

Not met. No studies showed a statistically significant or substantively important negative effect, and one study showed indeterminate effects.

Mixed effects: Evidence of inconsistent effects as demonstrated through either of the following criteria.

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant
or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

Not met. No studies showed a statistically significant or substantively important effect, either positive or negative. 

OR 

• Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing 
a statistically significant or substantively important effect. 

Not met. No studies showed a statistically significant or substantively important effect, and one study showed indeterminate effects. 



Potentially negative effects: Evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important negative effect. 

Not met. No studies showed a statistically significant or substantively important negative effect. 

AND 

• Criterion 2: No studies showing a statistically significant or substantively important positive effect, or more studies showing statistically significant or substantively
important negative effects than showing statistically significant or substantively important positive effects. 

Met. No studies showed statistically significant or substantively important positive effects. 

Negative effects: Strong evidence of a negative effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant negative effects, at least one of which met WWC evidence standards for a strong design. 

Not met. No studies showed statistically significant or substantively important negative effects. 

AND 

• Criterion 2: No studies showing statistically significant or substantively important positive effects. 

Met. No studies showed statistically significant or substantively important positive effects. 

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of 
potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E. 

Appendix A5.3 SuccessMaker® rating for the comprehension domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹
For the outcome domain of comprehension, the WWC rated SuccessMaker® as having potentially positive effects. 



Rating received 

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Met. Two studies showed substantively important positive effects.

AND

• Criterion 2: No studies showing a statistically significant or substantively important negative effect, and fewer or the same number of studies showing
indeterminate effects than showing statistically significant or substantively important positive effects.

Met. No studies showed a statistically significant or substantively important negative effect, and one study showed indeterminate effects.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

Not met. No studies showed statistically significant positive effects.

AND

• Criterion 2: No studies showing statistically significant or substantively important negative effects.

Met. No studies showed statistically significant or substantively important negative effects.

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of 
potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E. 
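
The two criteria for a "potentially positive" rating can be read as a simple count over study-level findings. The sketch below is an illustration only: it assumes an effect size of at least 0.25 in absolute value counts as "substantively important" (a threshold not stated in this appendix), and the function names are hypothetical. It ignores the additional domain-level considerations mentioned in footnote 1.

```python
SUBSTANTIVE = 0.25  # assumed threshold for a "substantively important" effect


def classify(effect_size, significant):
    """Label one study's finding as positive, negative, or indeterminate."""
    if significant or abs(effect_size) >= SUBSTANTIVE:
        return "positive" if effect_size > 0 else "negative"
    return "indeterminate"


def potentially_positive(studies):
    """Apply Criterion 1 and Criterion 2 to (effect_size, significant) pairs."""
    labels = [classify(es, sig) for es, sig in studies]
    positive = labels.count("positive")
    negative = labels.count("negative")
    indeterminate = labels.count("indeterminate")
    # Criterion 1: at least one positive study.
    # Criterion 2: no negative study, and no more indeterminate than positive studies.
    return positive >= 1 and negative == 0 and indeterminate <= positive


# Study-level average effect sizes from the comprehension and alphabetics tables.
comprehension = [(0.38, False), (0.06, False), (0.36, False)]
alphabetics = [(0.02, False)]
print(potentially_positive(comprehension), potentially_positive(alphabetics))
```

For comprehension this counts two positive studies and one indeterminate study, which matches the "Met" entries recorded for both criteria.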

Appendix A5.4 SuccessMaker® rating for the general literacy achievement domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.¹
For the outcome domain of general literacy achievement, the WWC rated SuccessMaker® as having potentially positive effects.



Rating received 

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

Met. One study showed a substantively important positive effect. 

AND 

• Criterion 2: No studies showing a statistically significant or substantively important negative effect, and fewer or the same number of studies showing 
indeterminate effects than showing statistically significant or substantively important positive effects.

Met. No studies showed a statistically significant or substantively important negative effect, and one study showed indeterminate effects. 

Other ratings considered 

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence. 

• Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design. 

Not met. No studies showed statistically significant positive effects. 

AND 

• Criterion 2: No studies showing statistically significant or substantively important negative effects. 

Met. No studies showed statistically significant or substantively important negative effects. 

1. For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of 
potentially positive or potentially negative effects. For a complete description, see the WWC Procedures and Standards Handbook, Appendix E. 

Appendix A6 Extent of evidence by domain



Outcome domain | Number of studies | Sample size: schools | Sample size: students | Extent of evidence¹
Alphabetics | 1 | 3 | 26 | Small
Reading fluency | 1 | 3 | 26 | Small
Comprehension | 3 | 12 | 450 | Medium to large
General literacy achievement | 1 | 3 | 26 | Small




1. A rating of “medium to large” requires at least two studies and two schools across studies in one domain and a total sample size across studies of at least 350 students or 14 classrooms.
Otherwise, the rating is “small.” For more details on the extent of evidence categorization, see the WWC Procedures and Standards Handbook, Appendix G.
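
The rule in footnote 1 can be written as a single condition. A minimal sketch, with a hypothetical function name; the classroom count is optional because the table above does not report it.

```python
def extent_of_evidence(n_studies, n_schools, n_students, n_classrooms=0):
    """Return the extent-of-evidence category described in footnote 1."""
    enough_studies = n_studies >= 2 and n_schools >= 2
    enough_sample = n_students >= 350 or n_classrooms >= 14
    return "Medium to large" if enough_studies and enough_sample else "Small"


# (studies, schools, students) per domain, copied from the table above.
domains = {
    "Alphabetics": (1, 3, 26),
    "Reading fluency": (1, 3, 26),
    "Comprehension": (3, 12, 450),
    "General literacy achievement": (1, 3, 26),
}
for name, counts in domains.items():
    print(name, extent_of_evidence(*counts))
```

Only comprehension clears both the study count and the 350-student bar; the other three domains rest on a single 26-student study.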